Abstract

Low-level edge detection in optical imagery
can be problematic in the ATR domain, where
highly complex scenes are the norm. Feature
detection algorithms typically take a global
approach, resulting in the discovery of many
fragmented lines which are not directly related to
stored model information. For this domain, we
have taken a top-down approach which searches
an optical image for the locally optimal
features based on the current hypothesized object
pose. The resulting linear features can then be
matched against a CAD model.

1 Introduction

Edge detection in the Automatic Target Recognition
(ATR) domain should be driven by the expectation of
which model features are assumed to be visible in a given
image. Given a hypothesized model pose, the visible
features are predicted from a CAD model [Mar96, Ste95], and a
local optimization procedure then finds the corresponding,
consistent data features in the image.

The process differs from the traditional low-level,
bottom-up edge detection process [MH80, Hil83], which
can be highly error-prone [Cla89]. The main problem
with bottom-up detection is its inability to deal with
large amounts of scene complexity and clutter. Color
imagery in the ATR domain often contains many different
structural events taking place simultaneously (camouflage
on military vehicles set against natural terrain
is an excellent example). Current edge detection
algorithms [BHR86, LB83, FL88] do not deal well with this
type of scene and produce many small, fragmented
line segments which can easily distract a model-based
matching system.

Furthermore, most ATR algorithms require that the
edges supplied have some physical significance relative
to the vehicle in the scene [Cla89]. The model matching
process will be more robust if there is a one-to-one
correspondence between extracted data lines and model
features. Our experience suggests that bottom-up feature
extraction cannot meet the requirements of this domain.

Consequently, we take a top-down approach in which
the current set of model features drives the search for
line segments in color images. Visible model lines are
predicted for a hypothesized pose [Mar96] and projected
into a given image using known sensor characteristics
[BHP94]. Local search then adjusts the orientation and
position of each segment to maximize the gradient
response. A similar approach has been applied using
gradient descent to perturb the line segment [SWF95].

2 Local Search

The model-driven approach is initialized by projecting
the predicted 3D model edges [Mar96] into the color
image. An error function uses a gradient mask oriented to
the direction of the model edge to determine the underlying
changes in pixel intensity. The error function is then
used to guide a local search algorithm in the selection of
a better edge position.

2.1 Oriented Gradient Mask

The gradient mask is constructed by rotating the first
derivative of a bivariate Gaussian to match the orientation
of the current model edge. There are many precedents
both for using tuned edge masks [Can86] and for using
the first derivative of a Gaussian [TF86]. Others have
obtained gradient estimates from steerable filters
[Shu94, FA91] for use in bottom-up edge detection.
However, contrary to other approaches [FL88], we are not
searching for the maximum gradient of a line of arbitrary
orientation, but rather for the gradient at the
orientation of the current model edge.

The horizontal first derivative of a bivariate Gaussian
is given by:

$$ G(a, b) = -2a \, e^{-(a^2 + b^2)} \qquad (1) $$

Figure 1: Gradient Mask and Response. (a) Silhouette line; (b) gradient mask; (c) gradient response; (d) weight mask.


where (a, b) represents the current position in the filter
coordinate system. In order to maximize the response to
an arbitrary orientation, the function is rotated to the
orientation of the given model line:

$$ a = i \cos\phi + j \sin\phi \qquad (2) $$

$$ b = -i \sin\phi + j \cos\phi \qquad (3) $$

where $\phi$ is the angle of rotation required, and (i, j) are
the gradient mask positions being calculated. Figure 1a
shows a model edge projected into a color image, along
with the gradient mask used (Figure 1b) to obtain the
gradient response (Figure 1c).
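
The construction above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes Equation (1) has the form G(a, b) = -2a exp(-(a^2 + b^2)), and the function name `oriented_gradient_mask` and the parameters `half_size` and `scale` (a spread factor applied to the rotated coordinates) are hypothetical. Under this assumed form the mask differentiates along the rotated a axis, so the angle passed in should be the rotation that places that axis across the model edge.

```python
import math

def oriented_gradient_mask(phi, half_size=3, scale=1.5):
    """Return a square mask (list of rows) sampling the assumed Eq. (1),
    G(a, b) = -2a * exp(-(a^2 + b^2)), in coordinates rotated by phi radians.

    half_size and scale are illustrative parameters, not values from the paper.
    """
    cos_phi, sin_phi = math.cos(phi), math.sin(phi)
    mask = []
    for j in range(-half_size, half_size + 1):        # mask rows
        row = []
        for i in range(-half_size, half_size + 1):    # mask columns
            a = (i * cos_phi + j * sin_phi) / scale   # Eq. (2)
            b = (-i * sin_phi + j * cos_phi) / scale  # Eq. (3)
            row.append(-2.0 * a * math.exp(-(a * a + b * b)))
        mask.append(row)
    return mask

# Example: a 7x7 mask rotated by 30 degrees.
mask = oriented_gradient_mask(math.radians(30.0))
```

Because the mask is antisymmetric about its center, its response changes sign with the polarity of the intensity step, which is consistent with taking the absolute value of the response in the error function.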


2.2 Defining the Error Function


The weight mask (shown in Figure 1d) for the current
model edge is then convolved with the response to the
gradient mask for each pixel lying under the line:


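$$ G_{\mathrm{Line}}(k) = \frac{\sum_{i=\mathrm{Line}_{x_a}}^{\mathrm{Line}_{x_b}} \sum_{j=\mathrm{Line}_{y_a}}^{\mathrm{Line}_{y_b}} |\mathrm{Grad}(i,j)| \cdot w(i,j)}{\gamma \cdot \sum_{i=\mathrm{Line}_{x_a}}^{\mathrm{Line}_{x_b}} \sum_{j=\mathrm{Line}_{y_a}}^{\mathrm{Line}_{y_b}} w(i,j)} \qquad (4) $$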


where $\mathrm{Grad}(i,j)$ is the gradient mask response, and
$w(i,j)$ is a weighting mask based on the distance of the
pixel from the true line, thus allowing the computation
of $G_{\mathrm{Line}}(k)$ with sub-pixel accuracy [Pin88]. A threshold
for $w(i,j)$ neglects pixels lying outside some radius. The
$\gamma$ term is the largest expected gradient possible for the
current mask, and will normalize $G_{\mathrm{Line}}(k)$ to the range
[0,1] for each line segment. The gradient response is
then converted to an error term:


$$ E_{\mathrm{Line}}(k) = 1 - G_{\mathrm{Line}}(k) \qquad (5) $$
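
As a rough sketch of how Equations (4) and (5) might be evaluated, the Python below computes the error for one line segment. Several details are assumptions rather than statements from the text: the denominator of Equation (4) is taken to be gamma times the sum of the weights, the weight falls off linearly with distance from the segment and is zero beyond `radius`, and the names `line_weight` and `line_error` are hypothetical. The gradient response `grad` is indexed as `grad[j][i]` (row, column).

```python
import math

def line_weight(i, j, xa, ya, xb, yb, radius=1.5):
    """Assumed weight mask: 1 on the segment, falling linearly to 0 at radius."""
    dx, dy = xb - xa, yb - ya
    length_sq = dx * dx + dy * dy
    t = 0.0
    if length_sq > 0.0:
        # Parameter of the closest point on the segment, clamped to its ends.
        t = max(0.0, min(1.0, ((i - xa) * dx + (j - ya) * dy) / length_sq))
    dist = math.hypot(i - (xa + t * dx), j - (ya + t * dy))
    return max(0.0, 1.0 - dist / radius)

def line_error(grad, xa, ya, xb, yb, gamma, radius=1.5):
    """E_Line = 1 - G_Line over the segment's bounding box (Eqs. 4 and 5)."""
    height, width = len(grad), len(grad[0])
    i_lo = max(0, math.floor(min(xa, xb)))
    i_hi = min(width - 1, math.ceil(max(xa, xb)))
    j_lo = max(0, math.floor(min(ya, yb)))
    j_hi = min(height - 1, math.ceil(max(ya, yb)))
    weighted_sum = weight_sum = 0.0
    for j in range(j_lo, j_hi + 1):
        for i in range(i_lo, i_hi + 1):
            w = line_weight(i, j, xa, ya, xb, yb, radius)
            weighted_sum += abs(grad[j][i]) * w
            weight_sum += w
    if weight_sum == 0.0:
        return 1.0  # no supporting pixels: treat as the worst case
    g_line = min(1.0, weighted_sum / (gamma * weight_sum))
    return 1.0 - g_line
```

Because the weights depend on the real-valued distance to the segment, endpoints with fractional coordinates change the result smoothly, which is the sense in which the weighting supports sub-pixel positioning.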


2.3 Defining the Neighborhoods


The local search algorithm uses the set of moves shown
in Figure 2 to perturb each model edge. The error,
$E_{\mathrm{Line}}(k)$, for each move is calculated, and the best move
in the set becomes the new model edge position. The
initial step and rotation sizes are set manually. Once
a local optimum is achieved, the move sizes are halved
and the process continues. Once they fall below a certain
threshold, and no further improvement can be made, the
current position of the edge is returned as the data line
corresponding to the current model edge.
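
The search loop described above might look like the following Python sketch. The move set of Figure 2 is not reproduced here, so the one used below (axis-aligned translations and rotations about the segment midpoint) is an assumption, as are the function name `local_search`, the default step sizes and thresholds, and the `max_iters` safety cap.

```python
import math

def local_search(error_fn, line, step=4.0, rot=math.radians(8.0),
                 min_step=0.25, min_rot=math.radians(0.5), max_iters=1000):
    """Greedy local search over a segment line = (xa, ya, xb, yb).

    error_fn maps a segment to an error (lower is better), e.g. E_Line(k).
    """
    def translate(seg, dx, dy):
        xa, ya, xb, yb = seg
        return (xa + dx, ya + dy, xb + dx, yb + dy)

    def rotate(seg, angle):
        xa, ya, xb, yb = seg
        cx, cy = (xa + xb) / 2.0, (ya + yb) / 2.0
        c, s = math.cos(angle), math.sin(angle)
        def turn(x, y):
            return (cx + c * (x - cx) - s * (y - cy),
                    cy + s * (x - cx) + c * (y - cy))
        return turn(xa, ya) + turn(xb, yb)

    best, best_err = line, error_fn(line)
    for _ in range(max_iters):
        moves = [translate(best, step, 0.0), translate(best, -step, 0.0),
                 translate(best, 0.0, step), translate(best, 0.0, -step),
                 rotate(best, rot), rotate(best, -rot)]
        candidate = min(moves, key=error_fn)
        candidate_err = error_fn(candidate)
        if candidate_err < best_err:
            best, best_err = candidate, candidate_err  # take the best move
        elif step <= min_step and rot <= min_rot:
            break                                      # local optimum at finest scale
        else:
            step, rot = step / 2.0, rot / 2.0          # refine the move sizes
    return best, best_err
```

Halving the move sizes only after a local optimum is reached lets the search cover large displacements early and refine the position later, without evaluating fine moves far from the final position.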

3 Results

The local search algorithm is currently being used in
a multi-sensor object recognition algorithm here at
Colorado State University [Ant96]. The results of the search
are shown in Figure 3. Figure 3a shows the predicted
model edges thought to be visible in the image for the
given pose hypothesis. Figure 3b shows the data
segments extracted for matching to those model features.
As can be seen, several of the data lines do not
correspond directly to the features desired for matching.
However, they are good enough to move the model closer
to the desired location, where a new correspondence can
be generated. As the model moves closer to the correct
position, the local search will find better features in the
data for matching.

Figure 3: Linear Features Detected